DeCOM: Decomposed Policy for Constrained Cooperative Multi-Agent Reinforcement Learning

Authors

Abstract

In recent years, multi-agent reinforcement learning (MARL) has demonstrated impressive performance in various applications. However, physical limitations, budget restrictions, and many other factors usually impose constraints on a multi-agent system (MAS), which cannot be handled by traditional MARL frameworks. Specifically, this paper focuses on constrained MASes where agents work cooperatively to maximize the expected team-average return under various constraints on costs, and develops a constrained cooperative MARL framework, named DeCOM, for such MASes. In particular, DeCOM decomposes the policy of each agent into two modules, which empowers information sharing among agents to achieve better cooperation. In addition, with such modularization, the training algorithm of DeCOM separates the original constrained optimization into an unconstrained optimization problem on reward and a constraint satisfaction problem on costs. DeCOM then iteratively solves these problems in a computationally efficient manner, which makes DeCOM highly scalable. We also provide theoretical guarantees on the convergence of DeCOM's policy update algorithm. Finally, we conduct extensive experiments to show the effectiveness of DeCOM with various types of costs in both moderate-scale and large-scale (with 500 agents) environments that originate from real-world applications.
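As a rough illustration of the alternating scheme the abstract describes, the toy sketch below alternates an unconstrained gradient step on a team reward with a constraint-satisfaction step toward a cost budget. This is not the paper's actual algorithm: the reward, cost, and projection here (`reward_grad`, `cost_fn`, `budget`) are hypothetical stand-ins chosen only to make the two-phase structure concrete.

```python
import numpy as np

def reward_grad(theta):
    # gradient of a hypothetical team reward -sum((theta - 2)^2),
    # maximized when every agent's parameter reaches 2
    return -2.0 * (theta - 2.0)

def cost_fn(theta):
    # hypothetical team-average cost
    return theta.mean()

def train(n_agents=4, budget=1.0, lr=0.1, steps=200):
    theta = np.zeros(n_agents)
    for _ in range(steps):
        # phase 1: unconstrained update that improves the team reward
        theta = theta + lr * reward_grad(theta)
        # phase 2: constraint satisfaction -- whenever the cost budget
        # is exceeded, shift every agent equally back onto the budget
        excess = cost_fn(theta) - budget
        if excess > 0:
            theta = theta - excess
    return theta

theta = train()
print(float(cost_fn(theta)))  # settles on the cost budget
```

The reward step alone would drive every parameter to 2 (cost 2.0); interleaving the correction step keeps the final policy on the budget while still improving reward as far as the constraint allows.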

Similar References

Reinforcement Learning in Cooperative Multi–Agent Systems

Reinforcement learning is used in cooperative multi-agent systems differently for various problems. We provide a review of the learning algorithms used for repeated common-payoff games and stochastic general-sum games. These learning algorithms are then compared with another algorithm for the credit assignment problem, which attempts to correctly assign to agents the rewards that they deserve.

Argumentation Accelerated Reinforcement Learning for Cooperative Multi-Agent Systems

Multi-Agent Learning is a complex problem, especially in real-time systems. We address this problem by introducing Argumentation Accelerated Reinforcement Learning (AARL), which provides a methodology for defining heuristics, represented by arguments, and incorporates these heuristics into Reinforcement Learning (RL) by using reward shaping. We define AARL via argumentation and prove that it ca...
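Potential-based reward shaping, the general mechanism AARL builds on, can be sketched in a few lines. The potential function below is a hypothetical stand-in; AARL derives its heuristics from an argumentation framework rather than a hand-written distance-to-goal.

```python
# Potential-based shaping: F(s, s') = gamma * phi(s') - phi(s).
# Adding F to the environment reward leaves optimal policies unchanged
# while steering exploration toward high-potential states.

def shaped_reward(r, s, s_next, phi, gamma=0.99):
    return r + gamma * phi(s_next) - phi(s)

# Example: prefer states closer to a (hypothetical) goal state at 10.
potential = lambda s: -abs(10 - s)
print(shaped_reward(0.0, 5, 6, potential) > 0)  # goal-ward moves earn a bonus
```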

Cooperative Multi-Agent Reinforcement Learning for Low-Level Wireless Communication

Traditional radio systems are strictly co-designed on the lower levels of the OSI stack for compatibility and efficiency. Although this has enabled the success of radio communications, it has also introduced lengthy standardization processes and imposed static allocation of the radio spectrum. Various initiatives have been undertaken by the research community to tackle the problem of artificial...

Levels of Realism for Cooperative Multi-Agent Reinforcement Learning

Training agents in a virtual crowd to achieve a task can be accomplished by allowing the agents to learn by trial-and-error and by sharing information with other agents. Since sharing enables agents to potentially reach optimal behavior more quickly, what type of sharing is best to use to achieve the quickest learning times? This paper categorizes sharing into three categories: realistic, unrea...

Parameter Sharing Deep Deterministic Policy Gradient for Cooperative Multi-agent Reinforcement Learning

Deep reinforcement learning for multi-agent cooperation and competition has been a hot topic recently. This paper focuses on the cooperative multi-agent problem based on actor-critic methods under local-observation settings. Multi-agent deep deterministic policy gradient has obtained state-of-the-art results for some multi-agent games, whereas it cannot scale well with a growing number of agents. In order ...
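The core idea of parameter sharing is that every agent queries one policy with shared weights, so the parameter count stays fixed as agents are added. The toy sketch below uses a single linear layer as the shared policy; it is a hypothetical stand-in, not the actor-critic architecture from the paper.

```python
import numpy as np

# One shared weight matrix serves all agents: each agent feeds in its own
# local observation, but no per-agent parameters exist.
rng = np.random.default_rng(0)
obs_dim, act_dim, n_agents = 8, 2, 500

W = rng.normal(size=(obs_dim, act_dim))  # the only learnable parameters

def act(obs):
    # deterministic shared policy: linear layer with tanh squashing
    return np.tanh(obs @ W)

observations = rng.normal(size=(n_agents, obs_dim))  # one row per agent
actions = act(observations)
print(actions.shape)  # (500, 2): 500 agents act from just 16 shared weights
```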

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2023

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v37i9.26288